ABSTRACT  Many  important  strategic  applications  require  access  to  and 
integration  of  disparate  information  resources.  We  refer  to  this  category  of 
information  systems  as  Composite  Information  Systems  (CIS).  Issues  involved  in 
developing  a  CIS  are  categorized  and  exemplified  to  reveal  the  deficiencies  of 
current  practice.  A  methodology  which  incorporates  Data  Base  Management 
Systems  and  Artificial  Intelligence  techniques  is  presented.  Furthermore,  an 
operational  Tool  Kit  for  CIS  (CIS/TK)  has  been  implemented.  The  CIS/TK 
ensemble  consists  of  four  components:  knowledge  processing,  information 
processing,  physical  and  logical  connectivity,  and  interface  tools.  It  is  an 
innovative  system  for  delivering  timely  knowledge  and  information  in  an  inter- 
organizational  setting.  In  the  rapidly  changing,  complex,  and  competitive  global 
market,  the  capability  to  dynamically  align  corporate  strategy  with  information 
technology  in  the  organizational  context  is  a  critical  issue  facing  the  executive. 
This  methodology  and  prototype  system  are  specifically  aimed  at  providing  a 
dynamic  platform  for  supporting  such  knowledge  and  information  intensive 
applications. 


KEY WORDS AND PHRASES: composite information systems, cooperative
systems, distributed databases, organizational information systems, systems
development.

Connectivity Among Information Systems

INTRODUCTION 

The rapidly increasing complexity, interdependence, and competition in the
global market have profoundly affected how corporations operate and how they
align their information technology (IT) for competitive advantage in the
marketplace. This alignment has accelerated demands for more effective
information management for decision-making, operational efficiency, and new
products and services [5, 21]. Meanwhile, the computing industry has made
significant advances in the price, speed, performance, capacity, and
capabilities of many information technologies over the last two decades. A key
concern facing corporations today is how to make the most effective use of IT
to meet their needs [12, 19].

For example, an on-going dialogue exists between the MIT Sloan School of
Management and the sponsors of the Management in the 1990's research program.
In a recent meeting with the sponsors¹ to assess their information technology
requirements for 1995, a critical need was identified for developing systems
that can provide access to, and integration of, their corporations' numerous
information systems, as depicted in Figure 1.

A  CIS   APPLICATION:   PAS 

It has become increasingly evident that many important applications require
such access to and integration of multiple disparate databases, both within and
across organizational boundaries, in order to increase productivity [4, 9]. We
refer to this type of system as a Composite Information System (CIS)
[11, 16, 20, 24].

Figure 1  A Perspective of the Information Technology Requirements for 1995

1. The meeting was held at the MIT Sloan School in late April 1988. The
participants were IT executives from American Express, Arthur Young & Co.,
Bell South, British Petroleum, Cigna Insurance, Digital Equipment Corporation
(DEC), International Computers, Ltd., Kodak, the IRS, and the U.S. Army.

Consider the Placement Assistant System (PAS), depicted in Figure 2, which is
being developed for the MIT Sloan Placement Office. PAS spans five information
systems in four organizations: (1) the student database and the interview
schedule database are located in the Sloan School; (2) the alumni database is
located in the MIT alumni office; (3) recent news is accessed by dialing into
Reuter's Textline database; and (4) recent financial information is accessed
through I.P. Sharp's Disclosure II database.

An interesting query for PAS to handle would be to "find companies interviewing
at Sloan that are auto manufacturers," along with

•  the students from these companies,

•  the alumni from these companies,

•  recent financial information, and

•  recent news about these companies.

Figure 2  A CIS for the Sloan Placement Office


This information would be very valuable to a student interested in the
automobile industry. Current students from these companies can offer first-hand
insider information. The alumni from these companies may be able to "put in a
good word" on the student's behalf. Recent financial information indicates the
economic environment at each company and may also be helpful during an
interview. Finally, recent news keeps the student abreast of what is going on
in each company and helps the student prepare for the interview with the
recruiters.

Figure  2  depicts  a  partial  menu  for  the  query  and  its  corresponding  answer  in  the 
case  of  Chrysler.  Many  research  problems  need  to  be  solved  in  order  to  obtain  the 
composite  information  for  this  query.  These  problems  can  be  categorized  into  first- 
and  second-order  issues,  as  discussed  below. 

CONNECTIVITY  ISSUES 

First  Order  Issues 

The  first-order  issues  are  encountered  immediately  when  attempting  to  provide 
access  to  and  integration  of  multiple  information  resources: 

•  multi-vendor  machines  (IBM  PC/RT,  IBM  4341,  AT&T  3B2,  etc.) 

•  physical  connection  (Ethernet,  wide-area  net,  etc.) 

•  different  databases  (ORACLE_SQL,  IBM's  SQL/DS,  flat  files) 

•  information  composition  (formatting) 

The issues of multiple vendor machines and physical communication are inherent
as long as information resources are dispersed across geographic locations, be
they intra- or inter-organizational. For example, the Sloan recruiting database
is implemented on an IBM PC/RT computer whereas the Sloan alumni database is
stored on an AT&T 3B2 computer. Communication protocols (e.g., TCP/IP on a LAN)
need to be established between different machines to encapsulate the machine
idiosyncrasies.

Assuming that hardware idiosyncrasies and networking problems are resolved,
the next hurdle is the idiosyncrasies of different databases. For example, the
recruiting database is developed in the ORACLE relational database and is thus
accessed through SQL-type queries, whereas I.P. Sharp's Disclosure II financial
database is accessed through a menu-driven interface. Different query commands,
and the corresponding skills, are required in order to obtain the information
available from these various information resources.

Second  Order  Issues 

Even if one is able to resolve the above problems, one will nevertheless
encounter the information composition task, which abounds with second-order
issues such as:

•  database navigation (where is the data for alumni position, base salary, etc.)

•  attribute naming (company attribute vs. comp name attribute)

•  simple domain value mapping ($, ¥, and £)

•  instance identification problem (IBM Corp in one database vs. IBM in another
database)

Database navigation is needed in order to determine which database to access to
get the required information. Furthermore, on a menu-driven database, e.g.,
Reuter's Textline, it is important to know which menu path to access in order
to save not only time but also access cost. Similarly, in a relational database
system, it is necessary to know in which tables the required data is located
(e.g., alumni position, company name) so that appropriate SQL queries can be
formulated.

Entity and attribute names may be termed differently among databases, such as
company vs. comp name. This type of issue has been referred to as the
schema-level integration problem [24]. In addition to schema-level integration,
it is necessary to perform mapping at the instance level. For example, sales
may be reported in $100 millions, but revenue in $millions. Furthermore, in a
multi-national environment, financial data may be recorded in $, ¥, or £
depending on the subsidiary.

The  instance  identification  problem  becomes  critical  when  multiple 
independently  developed  and  administered  information  systems  are  involved 
because  different  identifiers  may  be  used  in  different  databases,  e.g.,  IBM  Corp  vs. 
IBM.  In  the  more  complicated  cases,  no  common  key  identifiers  are  available  for 
joining  the  data  across  databases  for  the  same  entity.  We  refer  to  these  types  of 
second-order  issues  collectively  as  logical  connectivity,  as  will  be  elaborated  later. 
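
The attribute naming and domain value mapping issues above can be illustrated with a minimal Python sketch. It is not taken from CIS/TK: the mapping tables, the function names, the global attribute name `company_name`, and the local column spelling `comp_name` are all assumptions made for illustration, and the unit labels simply echo the "$100 millions" vs. "$millions" example.

```python
# Hypothetical illustration; none of these names come from CIS/TK itself.

# Schema level: map each (database, local attribute) pair to one global name.
ATTRIBUTE_MAP = {
    ("recruiting", "company"): "company_name",
    ("alumni", "comp_name"): "company_name",
}

# Instance level: scale factors that bring reported figures to plain dollars.
UNIT_SCALE = {"$millions": 1_000_000, "$100 millions": 100_000_000}


def to_global(database, record):
    """Rename local attributes to their global names; leave others untouched."""
    return {ATTRIBUTE_MAP.get((database, key), key): value
            for key, value in record.items()}


def to_dollars(amount, unit):
    """Convert a reported figure to plain dollars using its declared unit."""
    return amount * UNIT_SCALE[unit]


recruiting_row = to_global("recruiting", {"company": "IBM Corp"})
alumni_row = to_global("alumni", {"comp_name": "IBM"})

# Both rows now expose the same attribute name, although the values still
# differ ("IBM Corp" vs. "IBM"); that is the instance identification problem.
print(recruiting_row, alumni_row)

# Sales of 3 (in $100 millions) and revenue of 300 (in $millions) are equal.
print(to_dollars(3, "$100 millions") == to_dollars(300, "$millions"))
```

Keeping the mappings in data tables rather than in program logic reflects the concern, raised in the next section, that such knowledge should be easy to extend.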

CURRENT    PRACTICE 

Although many Composite Information Systems exist today, most are in reality a
combination of human operators and computer systems. The human intervention
required to interface multiple independent databases makes the process
expensive, time-consuming, and error-prone. For example, a major international
bank [9] developed its bank products (e.g., funds transfer, letters of credit,
loans, cash management) autonomously. When information had to be exchanged, it
was often accomplished by "tape hand-offs", usually at night, as depicted in
Figure 3.

Figure 3  Access Multiple Systems Through Human Operators


On the other hand, the need for integration has been increasing rapidly at both
the user level and the database level. Since each system had its own directly
connected terminals, users who required access to multiple systems had to have
multiple terminals in their office, or walk to an area of the building that had
a terminal tied to the system needed. The "tape hand-offs" mechanism was used
to create "shadow" databases of each other's real databases (Figure 3). Since a
shadow database diverges from the real database during the day, inconsistencies
could result.

The problem of integration has been intensified by the need for evolution in at
least three areas: current products, new products, and new technology. As the
current products become more sophisticated, there is a need to acquire more
information from other systems. Increasing the number of "tape hand-offs" leads
to processing complexities. It would be advantageous if the human operator
component could be automated as much as possible.

To overcome these operational difficulties, more complex systems have been
developed that directly tap into multiple information resources [16]. In these
cases, the knowledge of the contents of these sources and of the necessary
transformations is encoded into customized programs and is not captured in a
manner that allows it to be easily extended or used in new ways. The key thrust
of the work described in this paper is the ability to capture this knowledge in
an easy way and to use general-purpose tools driven by such a knowledge base to
enable rapid, flexible, and extensible development of powerful composite
information systems.

SOLUTION  APPROACH 

Confronted with these problems, we have found it effective to incorporate
Artificial Intelligence (AI) technology as part of the solution. AI technology
has proven to be very useful in making rule-based inferences. By integrating
the information sharing capability of DBMS technology and the knowledge
processing power of AI technology [13], multiple independent disparate
databases may be accessed in concert with minimum human intervention.

In addressing the integration of AI and DBMS technologies, Albert, Chern, and
Sears² have suggested that:

"In  the  long  term,  research  is  needed  to  find  ways  for  knowledge-based  system 
technology  to  support  database  systems  and  vice  versa.  In  the  near  term,  research  is 
needed  to  develop  tools  that  support  the  design  and  development  of  systems  that  require 
an  integrated  set  of  knowledge  base  and  database  system  tools." 

The Tool Kit for Composite Information Systems (CIS/TK) is a research prototype
being developed at the MIT Sloan School of Management for providing such an
integrated set of knowledge base and database system tools. It is implemented
in the UNIX environment to take advantage of both its portability across
disparate hardware and its multi-programming and communications capabilities,
which enable access to multiple disparate remote databases in concert³. CIS/TK
employs an object-oriented approach and a rule-based mechanism, as discussed
below.

THE  CIS/TK  ENSEMBLE 

The  CIS/TK  ensemble  can  be  viewed  as  a  Knowledge  and  Information  Delivery 
System  (KIDS),  as  depicted  in  Figure  4,  which  has  four  functional  components: 
knowledge  processing,  information  processing,  physical  and  logical  connectivity,  and 
user  interfaces.  Specifically,  it  consists  of  the  following  four  subsystems: 

(1)  Knowledge  Processing 

The knowledge processing subsystem is based on an enhanced version of the
Knowledge-Object Representation Language [14], which facilitates an
object-oriented approach and a rule-based inferencing mechanism. This subsystem
provides three benefits: (1) it gives us the capability to evolve the code for
experimenting with and developing innovative concepts; (2) it provides the
required knowledge representation and reasoning capabilities for
knowledge-based processing in the heterogeneous distributed DBMS environment;
and (3) it is very simple to interface with off-the-shelf software products
(e.g., ORACLE and INFORMIX) through the I/O redirection and piping capability
inherent in the UNIX environment. The reader is referred to Levine [14] for a
detailed description of the knowledge processing component.

Figure 4  Knowledge and Information Delivery Systems (KIDS)

2. In the foreword of Topics in Information Systems: On Knowledge Base
Management Systems, Brodie and Mylopoulos, eds., 1986.

3. The choice of UNIX portability on conventional machines reflects our
pragmatic philosophy.

(2)  Information  Processing  &  Physical  Connectivity 

The CIS/TK query processor architecture is shown in Figure 5. The architecture
consists of an Application Query Processor (AQP), a Global Query Processor
(GQP), and a Local Query Processor (LQP) to interface with the query command
processor (e.g., DBMS) for each information resource in the CIS.

Figure 5  The CIS/TK Query Processor Architecture

The  AQP  converts  an  application  model  query,  defined  by  an  application 
developer,  into  a  sequence  of  global  schema  queries,  passes  them  on  to  GQP,  and 
receives  the  results. 

The  primary  query  processor  is  the  GQP.  It  converts  a  global  schema  query  into 
abstract  local  queries,  sends  them  to  the  appropriate  LQPs,  and  joins  the  results 
before  passing  them  back  to  AQP.  The  GQP  must  know  where  to  get  the  data,  how 
to  map  global  schema  attribute  names  to  column  names,  and  how  to  join  results  from 
different  tables. 

The  LQP  establishes  the  physical  connection  between  the  host  and  the 
appropriate  remote  machines  where  information  is  stored,  transforms  the  abstract 
local  query  into  the  appropriate  executable  query  commands  for  the  remote  system, 
sends  the  executable  query  commands  to  the  actual  processor,  receives  the  results, 
and  transforms  the  results  to  the  standard  GQP  format. 

The  CIS/TK  query  processor  architecture  is  the  key  to  the  integration  of  the 
disparate  information  resources  [11].  It  provides  the  platform  for  the  logical 
connectivity,  as  discussed  below. 
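
The division of labor between the GQP and the LQPs can be sketched in a few lines of Python. This is an illustrative sketch only, not the CIS/TK implementation: the class names, the `execute` and `query` methods, the attribute map layout, and the sample rows are all hypothetical, the LQP is replaced by an in-memory stand-in, and the AQP layer is omitted.

```python
# Hypothetical sketch: class names, method names, and data are illustrative.

class InMemoryLQP:
    """Stands in for a Local Query Processor. A real LQP would connect to a
    remote system and translate the abstract query into native commands."""

    def __init__(self, rows):
        self.rows = rows

    def execute(self, columns):
        # Return the requested local columns in a standard format (dicts).
        return [{col: row[col] for col in columns} for row in self.rows]


class GlobalQueryProcessor:
    def __init__(self, lqps, attribute_map):
        self.lqps = lqps                    # database name -> LQP
        self.attribute_map = attribute_map  # global attr -> {database: column}

    def query(self, global_attrs, join_attr):
        joined = {}
        for db, lqp in self.lqps.items():
            # Map global attribute names to this database's column names.
            cols = {g: self.attribute_map[g][db]
                    for g in global_attrs if db in self.attribute_map[g]}
            if join_attr not in cols:
                continue  # this database cannot take part in the join
            for row in lqp.execute(list(cols.values())):
                renamed = {g: row[c] for g, c in cols.items()}
                # Merge rows from different databases on the join attribute.
                joined.setdefault(renamed[join_attr], {}).update(renamed)
        return list(joined.values())


gqp = GlobalQueryProcessor(
    lqps={
        "recruiting": InMemoryLQP([{"company": "Chrysler", "slot": "slot_1"}]),
        "alumni": InMemoryLQP([{"comp_name": "Chrysler", "alum": "alumnus_1"}]),
    },
    attribute_map={
        "company_name": {"recruiting": "company", "alumni": "comp_name"},
        "interview_slot": {"recruiting": "slot"},
        "alumnus": {"alumni": "alum"},
    },
)
result = gqp.query(["company_name", "interview_slot", "alumnus"], "company_name")
print(result)
```

Because the GQP in this sketch only calls `execute`, supporting a new information resource means writing one more LQP class and extending the attribute map; the existing processors are left untouched.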

(3)  Logical  Connectivity 

With the query processor architecture addressing the first-order and some
second-order issues, we are now in a position to address the more sophisticated
logical connectivity issues: for example, resolving conflicts among
incompatible information, concept inferencing, and identifying the same
instance across multiple information resources [24]. We illustrate the process
of instance identification below.


Database #1 (Created by: Rich, Instructor for 564 and 579)

Name*         564   579   Sec564   Age   Perform   Address
Jane Murphy   Yes   Yes   A.M.     19    Strong    Marblehead

Database #2 (Created by: Dave, head TA for 564)

Nickname*   Sec564   Sex   Major   Status   Trans
[Rows not recoverable from the source; surviving cell values include A.M., MIS,
UG, car, "sharp cookie", "discard", walk, and "routine".]

Figure 6  Student Databases Without Common Key Identifier

Suppose  that  Rich,  the  instructor  for  Management  Information  Technology  (MIS 
564)  and  Communication  and  Connectivity  (MIS  579),  has  a  database  of  students 
who  take  564  and  579;  while  Dave,  the  head  teaching  assistant  for  564,  has  a 
database  for  the  564  students.  For  some  reason,  Rich  would  like  to  know  Dave's 
opinion  about  Jane  Murphy,  an  instance  in  his  student  database. 

As Figure 6 shows, the two databases that Rich and Dave have do not share a
common key identifier for joining the data. Under this circumstance, the
conventional database join technique is not applicable. Current practice
requires human interaction in order to identify the same instance across
databases, i.e., matching Jane Murphy from Rich's database with one of the
students in Dave's database.

A moment of sharp observation would lead one to conclude that the human
interaction involves the process of subsetting through common attributes to
eliminate the unrelated candidate students, followed by some heuristics to draw
a conclusion. There are two common attributes in the two databases, i.e.,
sec564 and performance. By applying these two attributes, the candidate
students that could correspond to Jane are reduced from the entire database to
five (i.e., those who attend the A.M. section of 564 with strong performance,
as shown in the first five rows of Dave's database).

Using the other attributes in these databases, plus auxiliary databases and
inferencing rules, one may come to the conclusion that Jane Murphy is "Happy".
The logic goes as follows:

•  Jane  is  19  years  old;  therefore,  the  status  is  most  likely  "UG"  (undergraduate) 
[this  eliminates  "Doc"]. 

•  Assuming  the  availability  of  a  database  of  typical  male  and  female  names,  we 
can  conclude  that  Jane  Murphy  is  a  female  [this  eliminates  "Sleepy"]. 

•  Jane  lives  in  Marblehead.  Assuming  a  distance  database  of  locations  of  New 
England  exists,  we  determine  that  Marblehead  is  27  miles  from  Cambridge  and 
therefore,  it  is  unlikely  that  the  transportation  type  is  bike  [this  eliminates 
"Dopey"]. 

•  Jane takes 564 and 579, which are the core courses for the MIS major;
therefore, it is more logical to conclude that Jane Murphy is majoring in MIS
[this eliminates "Sneezy"].

Therefore, Jane Murphy is "Happy", who, in Dave's opinion, is a sharp cookie.
Note that this analysis requires a combination of database and AI techniques.
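
The subsetting and elimination steps described above can be sketched as follows. This is a simplified illustration rather than the CIS/TK rule base. The candidate rows are hypothetical: only the nicknames and the reason each one is eliminated come from the text, and the remaining values, the age and distance thresholds, the `perform` column assumed for Dave's database, and the contents of the auxiliary tables are assumptions.

```python
# Hypothetical data: every value not stated in the text is an assumption.
jane = {"name": "Jane Murphy", "sec564": "A.M.", "perform": "Strong",
        "age": 19, "address": "Marblehead", "courses": {"564", "579"}}


def student(nickname, sec564="A.M.", perform="Strong", sex="F",
            major="MIS", status="UG", trans="car"):
    """Build one row of Dave's database; defaults match the inferred profile."""
    return {"nickname": nickname, "sec564": sec564, "perform": perform,
            "sex": sex, "major": major, "status": status, "trans": trans}


daves_students = [
    student("Happy"),
    student("Doc", status="Grad"),
    student("Sleepy", sex="M"),
    student("Dopey", trans="bike"),
    student("Sneezy", major="Finance"),
    student("Grumpy", sec564="P.M."),
]

# Auxiliary knowledge that the text assumes is available in other databases.
FEMALE_FIRST_NAMES = {"Jane"}
MILES_FROM_CAMBRIDGE = {"Marblehead": 27}
CORE_MIS_COURSES = {"564", "579"}
MAX_UNDERGRAD_AGE = 22   # assumed threshold
MAX_BIKE_MILES = 10      # assumed threshold


def identify(person, candidates):
    # Step 1: subset on the attributes the two databases have in common.
    pool = [c for c in candidates
            if c["sec564"] == person["sec564"]
            and c["perform"] == person["perform"]]

    # Step 2: heuristics, each inferring a likely attribute value.
    inferred = {}
    if person["age"] <= MAX_UNDERGRAD_AGE:
        inferred["status"] = "UG"
    if person["name"].split()[0] in FEMALE_FIRST_NAMES:
        inferred["sex"] = "F"
    if CORE_MIS_COURSES <= person["courses"]:
        inferred["major"] = "MIS"
    pool = [c for c in pool
            if all(c[attr] == value for attr, value in inferred.items())]

    # Step 3: a long commute makes a bicycle unlikely.
    if MILES_FROM_CAMBRIDGE.get(person["address"], 0) > MAX_BIKE_MILES:
        pool = [c for c in pool if c["trans"] != "bike"]
    return [c["nickname"] for c in pool]


matches = identify(jane, daves_students)
print(matches)
```

With this hypothetical data only "Happy" survives all three steps. In practice several candidates, or none, may remain, so the result is best treated as a list of plausible matches rather than a definitive answer.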

(4)  Interface  Tools 

To  facilitate  the  development  of  such  systems,  a  set  of  tools  for  building  global 
schemata,  application  models,  and  application  model  queries  are  currently  being 
developed  [24].  The  Global  Schema  Builder  assists  in  the  construction  of  a  consistent 
global  schema  model.  The  Application  Model  Builder  assists  in  the  creation  of  an 
application  model.  Finally,  the  User  Query  Builder  extends  the  functionality  of  an 
application  model  by  assisting  in  the  definition  of  new  queries. 

The  Global  Schema  Builder  supports  the  following  steps:  (1)  representing  each 
local  database  as  a  local  component  model,  which  directly  represents  the  entities  and 
attributes  of  that  database;  (2)  extending  the  local  component  model  by  creating 
objects  which  logically  represent  those  of  the  domain;  and  (3)  creating  a  global 
integrated  model  which  combines  the  objects  from  the  local  integrated  models  [8]. 

Also output from the Global Schema Builder are tables used to unify the local
information resources: (1) Inter-Database Tables, which specify one-to-one
mappings between items in multiple local information resources; (2)
Inter-Database Concept Grouping Tables, which specify how concepts are related
hierarchically; and (3) Inter-Database Instance Identifier Tables, which group
the local join keys that are associated with a unique global instance.
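
As a rough illustration of the third kind of table, the sketch below groups local join keys under a global instance and uses the grouping to join rows whose local keys differ. The table layout, function names, database names, and rows are hypothetical; the source does not describe the actual table format.

```python
# Hypothetical contents: the table layout and key values are assumptions.
INSTANCE_IDENTIFIERS = {
    # global instance -> {database name: local join key}
    "IBM": {"recruiting": "IBM Corp", "alumni": "IBM"},
}


def global_instance(database, local_key):
    """Return the global instance a local key belongs to, or None if unknown."""
    for instance, keys in INSTANCE_IDENTIFIERS.items():
        if keys.get(database) == local_key:
            return instance
    return None


def join_on_instance(rows_a, db_a, key_a, rows_b, db_b, key_b):
    """Pair rows from two databases that refer to the same global instance."""
    index = {}
    for row in rows_b:
        index.setdefault(global_instance(db_b, row[key_b]), []).append(row)
    pairs = []
    for row in rows_a:
        instance = global_instance(db_a, row[key_a])
        if instance is None:
            continue  # unknown keys are never matched to each other
        for other in index.get(instance, []):
            pairs.append((instance, row, other))
    return pairs


recruiting = [{"company": "IBM Corp", "position": "analyst"}]
alumni = [{"comp_name": "IBM", "alumnus": "alumnus_1"}]
pairs = join_on_instance(recruiting, "recruiting", "company",
                         alumni, "alumni", "comp_name")
print(pairs)
```

A conventional equi-join on the company columns would find no match here, since "IBM Corp" and "IBM" are different strings; the identifier table supplies the missing link.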
Prototype  Status 

Taken together, the above four subsystems comprise a KIDS for delivering
timely knowledge and information in a diversity of situations. The primary
design goal of CIS/TK is to move responsibility from the user to the system in
the following three areas: (1) physical connection to remote databases; (2)
database navigation, attribute mapping, etc.; and (3) more advanced logical
connectivity issues.

An operational prototype has been developed at the MIT Sloan School of
Management, as exemplified in Figure 6. Currently, the system allows for
simultaneous access to relational databases on multiple machines (an IBM PC/RT
and two AT&T 3B2 computers) via the LQPs. The LQPs are implemented as objects
with a standard set of protocols. Physical communication details and database
idiosyncrasies are encapsulated within the object. The advantage of this
approach is its extensibility. For instance, LQPs are currently under
construction to interface the GQP with SQL/DS on an IBM 4341 accessible through
an Ethernet LAN, and with Reuter's Textline database and I.P. Sharp's
Disclosure II database, which are neither relational nor accessible through
local networks. The CIS/TK architecture will accommodate the new LQPs easily,
without modification to the existing LQPs or the GQP. We now turn our attention
to the applicability of CIS/TK in a diversity of situations.

Figure 6  An Operational CIS/TK Prototype

APPLICATION    DOMAINS

Four categories of situations where a composite information system can be
strategically advantageous are summarized below:

(1)  Inter-organizational - which involves two or more separate organizations
(e.g., a direct connection between the production planning system in one
company and the order entry system in another company).

(2)  Inter-divisional - which involves two or more divisions within a firm
(e.g., corporate-wide coordinated purchasing).

(3)  Inter-product - which involves the development of sophisticated
information services by combining simpler services (e.g., a cash management
account that combines brokerage services, checks, credit card, and savings
account features).

(4)  Inter-model - which involves combining separate models to make more
comprehensive models (e.g., combining an economic forecasting model with an
optimal distribution model to analyze the impact of economic conditions on
distribution).

As mentioned earlier in the PAS application (Figure 2), in order to find
companies interviewing at Sloan that are auto manufacturers, alumni and
students from these companies, and recent information about these companies,
all five databases in the four organizations need to be accessed. CIS/TK can be
applied to facilitate this process through its query processor architecture.
Moreover, its knowledge processing component can be employed to perform complex
heuristic reasoning.

The versatility of CIS/TK for a diversity of applications reflects our research
perspective. A comparison of CIS/TK with some related state-of-the-art systems
[1, 6, 7, 8, 10, 15, 17, 18, 23] will clarify the point, as discussed below.

The capabilities of the CIS/TK ensemble include: query processing facility,
local database access, multiple remote database access in concert, rules and
objects, instance identification facility, semantic reconciliation facility,
global schema builder, application model builder, and user query builder.

Each feature per se is interesting. However, the major benefit comes from the
capability of the holistic ensemble, as a result of the interfaces and tools
built into the tool kit, to facilitate the delivery of timely knowledge and
information to the end-user.

We compare CIS/TK with some representative products or prototypes from the
related research areas: Schema Integration [1, 8], Distributed DBMSs
[6, 7, 10, 15], and Object-Oriented Databases [17, 18, 23]⁴. Note that none of
the other systems was specifically designed for the timely delivery of
knowledge and information in applications that require multiple independent
disparate databases to work together within and/or across organizational
boundaries, as Table 1 shows. Therefore, it is not surprising that no single
system includes the comprehensive set of capabilities incorporated in CIS/TK
which are required to accomplish that goal.

CONCLUDING     REMARKS 

The  CIS/TK  ensemble  is  a  unique  and  innovative  system  for  delivering  timely 
knowledge  and  information  in  an  inter-organizational  setting.  In  the  rapidly 
changing,  complex,  and  competitive  global  market,  the  capability  to  dynamically 
align  corporate  strategy  with  information  technology  in  the  organizational  context  is 
a  critical  issue  facing  the  executive.  CIS/TK  is  aimed  at  providing  such  a  dynamic  IT 
platform  for  supporting  knowledge  and  information  intensive  applications. 

Our focus is on real, nontrivial, and exciting problems challenging today's and
tomorrow's IS executives. The operational prototype we have implemented
demonstrates the feasibility of such an innovative concept. In the near term,
we plan to extend the system through the following tasks: (1) design and
implement facilities for credibility analysis, conflict resolution, and further
concept inferencing; (2) develop more efficient local query processors and
communication servers; and (3) demonstrate the feasibility of CIS/TK to
interface with geographically disparate off-the-shelf products or customized
systems, such as Reuter's Dataline, Textline, and Newsline databases. We
believe that this effort will not only contribute to the academic research
frontier but also benefit the business community in the foreseeable future.

4. Also private communications with researchers at CCA on MULTIBASE and PROBE,
and Ontologic, Inc. on Vbase.